Graph clustering with local search optimization: the resolution bias of the objective function matters most.

Authors

  • Twan van Laarhoven
  • Elena Marchiori
Abstract

Results of a recent comparative experimental assessment of methods for network community detection applied to benchmark graphs indicate that the two best methods use different objective functions but a similar local search-based optimization (LSO) procedure. This observation motivates the following research question: given the LSO procedure, how much does the choice of objective function influence the results, and in what way? We address this question empirically in a broad graph clustering context, that is, when graphs are either given as such or are k-nearest-neighbor graphs generated from a given data set. We consider normalized cut, modularity, and infomap, as well as two new objective functions. We show that all these objectives have a resolution bias, that is, they tend to prefer either small or large clusters. When this bias is removed by forcing the objective to generate a given number of clusters, LSO achieves similar performance across the considered objective functions on benchmark networks with built-in community structure. These results indicate that the resolution bias is the most important difference between objective functions in graph clustering with LSO. Spectral clustering is an alternative to LSO, which has been used to optimize the popular normalized cut and modularity objectives. We show experimentally that LSO often outperforms spectral clustering on various benchmark, real-life, and k-nearest-neighbor graphs. These results, together with the flexibility and efficiency of LSO, provide arguments in favor of this optimization method.
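
To make the idea of optimizing a clustering objective by local search concrete, here is a minimal Python sketch. It is not the authors' LSO procedure: it assumes an unweighted, undirected graph given as an edge list, uses the standard Newman-Girvan modularity as the objective, and simply moves one node at a time to a neighboring cluster whenever that raises the objective. The names modularity and local_search are illustrative, and the objective is recomputed from scratch on every trial move, which is simple but inefficient.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Newman-Girvan modularity of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)  # number of edges inside each cluster
    degree = defaultdict(int)    # summed node degree of each cluster
    for u, v in edges:
        degree[labels[u]] += 1
        degree[labels[v]] += 1
        if labels[u] == labels[v]:
            internal[labels[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def local_search(edges, nodes, objective=modularity):
    """Greedy node-moving local search (illustrative sketch only).

    Starts from singleton clusters and repeatedly moves a node to the
    neighboring cluster that most increases the objective, until no
    single move improves it.
    """
    labels = {v: v for v in nodes}
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    best = objective(edges, labels)
    improved = True
    while improved:
        improved = False
        for v in nodes:
            original = labels[v]
            best_label = original
            candidates = {labels[u] for u in neighbors[v]} - {original}
            for c in sorted(candidates):
                labels[v] = c  # trial move
                score = objective(edges, labels)
                if score > best + 1e-12:
                    best, best_label = score, c
            labels[v] = best_label  # keep the best move, or undo the trial
            if best_label != original:
                improved = True
    return labels, best


# Toy graph: two triangles joined by a single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels, q = local_search(edges, nodes=range(6))
print(labels, q)
```

Because the objective is passed in as a function, a different one (for example a normalized-cut score written with the same signature) can be swapped in without touching the search loop, which is the kind of comparison the abstract describes.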

Similar resources

A Framework for Adapting Population-Based and Heuristic Algorithms for Dynamic Optimization Problems

In this paper, a general framework is presented for adapting swarm-intelligence-based heuristic optimization algorithms from static to dynamic environments. In dynamic optimization problems, unlike static ones, the evaluation function or constraints change over time, and hence so does the location of the optimum. The subject matter of the framework is based on the variability of the ...

Optimizing Cost Function in Imperialist Competitive Algorithm for Path Coverage Problem in Software Testing

Search-based optimization methods have been used for software engineering activities such as software testing. In the field of software testing, search-based test data generation refers to application of meta-heuristic optimization methods to generate test data that cover the code space of a program. Automatic test data generation that can cover all the paths of software is known as a major cha...

Tabu-KM: A Hybrid Clustering Algorithm Based on Tabu Search Approach

The clustering problem under the minimum sum-of-squares criterion is a non-convex, non-linear program with many local optima, so its solution often falls into these traps and fails to converge to the global optimum. In this paper, an efficient hybrid optimization algorithm, called Tabu-KM, is developed for solving this problem. It gathers the ...

Automatic Data Clustering Using an Improved Imperialist Competitive Algorithm

Imperialist Competitive Algorithm (ICA) is regarded as a leading meta-heuristic algorithm for finding the global optimum in optimization problems. This paper presents an application of ICA to automatic clustering of huge unlabeled data sets. By using a proper structure for each of the chromosomes and the ICA, at run time, the suggested method (ACICA) finds the optimum number of clusters while optim...

Improved COA with Chaotic Initialization and Intelligent Migration for Data Clustering

A well-known clustering algorithm is K-means. This algorithm, besides advantages such as high speed and ease of employment, suffers from the problem of local optima. In order to overcome this problem, a lot of studies have been done in clustering. This paper presents a hybrid Extended Cuckoo Optimization Algorithm (ECOA) and K-means (K), which is called ECOA-K. The COA algorithm has advantages ...

Journal:
  • Physical review. E, Statistical, nonlinear, and soft matter physics

Volume 87, Issue 1

Publication date: 2013